Local Computation of PageRank Contributions

Authors

  • Reid Andersen
  • Christian Borgs
  • Jennifer T. Chayes
  • John E. Hopcroft
  • Vahab S. Mirrokni
  • Shang-Hua Teng
Abstract

Motivated by the problem of detecting link-spam, we consider the following graph-theoretic primitive: Given a webgraph G, a vertex v in G, and a parameter δ ∈ (0, 1), compute the set of all vertices that contribute to v at least a δ fraction of v’s PageRank. We call this set the δ-contributing set of v. To this end, we define the contribution vector of v to be the vector whose entries measure the contributions of every vertex to the PageRank of v. A local algorithm is one that produces a solution by adaptively examining only a small portion of the input graph near a specified vertex. We give an efficient local algorithm that computes an ε-approximation of the contribution vector for a given vertex by adaptively examining O(1/ε) vertices. Using this algorithm, we give a local approximation algorithm for the primitive defined above. Specifically, we give an algorithm that returns a set containing the δ-contributing set of v and at most O(1/δ) vertices from the δ/2-contributing set of v, and which does so by examining at most O(1/δ) vertices. We also give a local algorithm for solving the following problem: If there exist k vertices that contribute a ρ-fraction to the PageRank of v, find a set of k vertices that contribute at least a (ρ − ε)-fraction to the PageRank of v. In this case, we prove that our algorithm examines at most O(k/ε) vertices.
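
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to approximate a contribution vector locally: a backward "push" procedure that starts at the target vertex and walks in-links. The function name `approx_contributions`, the adjacency format `out_edges`, and the teleport probability `alpha` are assumptions made for illustration and do not come from the abstract; `eps` plays the role of ε.

```python
from collections import defaultdict, deque


def approx_contributions(out_edges, target, alpha=0.15, eps=1e-3):
    """Approximate each vertex's personalized-PageRank contribution to `target`.

    out_edges: dict mapping every vertex to a non-empty list of out-neighbors
               (hypothetical input format, assumed for this sketch).
    Returns a dict p such that, for every vertex u, the true contribution
    of u to `target` lies between p.get(u, 0) and p.get(u, 0) + eps.
    """
    # Reverse adjacency: the push step follows edges backwards.
    in_edges = defaultdict(list)
    for u, nbrs in out_edges.items():
        for w in nbrs:
            in_edges[w].append(u)

    p = defaultdict(float)   # settled contribution estimates
    r = defaultdict(float)   # residual mass still to be distributed
    r[target] = 1.0
    queue = deque([target])
    queued = {target}

    while queue:
        u = queue.popleft()
        queued.discard(u)
        mass = r[u]
        if mass <= eps:
            continue
        # Settle an alpha share at u, pass the rest to u's in-neighbors.
        r[u] = 0.0
        p[u] += alpha * mass
        for w in in_edges[u]:
            r[w] += (1 - alpha) * mass / len(out_edges[w])
            if r[w] > eps and w not in queued:
                queue.append(w)
                queued.add(w)
    return dict(p)
```

Only vertices whose residual exceeds `eps` are ever expanded, so the procedure touches a neighborhood of the target rather than the whole graph. Thresholding the returned estimates would then give a candidate contributing set, again under the assumptions stated above.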


Similar resources

Local Approximation of PageRank and Reverse PageRank

We consider the problem of approximating the PageRank of a target node using only local information provided by a link server. This problem was originally studied by Chen, Gan, and Suel (CIKM 2004), who presented an algorithm for tackling it. We prove that local approximation of PageRank, even to within modest approximation factors, is infeasible in the worst-case, as it requires probing the li...

Exploiting the Block Structure of the Web for Computing PageRank

The web link graph has a nested block structure: the vast majority of hyperlinks link pages on a host to other pages on the same host, and many of those that do not link pages within the same domain. We show how to exploit this structure to speed up the computation of PageRank by a 3-stage algorithm whereby (1) the local PageRanks of pages for each host are computed independently using the link...

Graph fibrations, graph isomorphism, and PageRank

PageRank is a ranking method that assigns scores to web pages using the limit distribution of a random walk on the web graph. A fibration of graphs is a morphism that is a local isomorphism of in-neighbourhoods, much in the same way a covering projection is a local isomorphism of neighbourhoods. We show that a deep connection relates fibrations and Markov chains with restart, a particular kind o...

Asynchronous iterative methods for the effective computation of PageRank

Iterative algorithms are the building blocks of important scientific computations. However, their semantics-preserving implementation over modern distributed computing platforms introduces synchronization phases between cooperating tasks. These phases increase overall idle time and put tight upper bounds on performance. A drastic measure would be a total elimination of these phases: Each process...

Web-Site-Based Partitioning Techniques for Efficient Parallelization of the PageRank Computation

The efficiency of the PageRank computation is important since the constantly evolving nature of the Web requires this computation to be repeated many times. PageRank computation includes repeated iterative sparse matrix-vector multiplications. Due to the enormous size of the Web matrix to be multiplied, PageRank computations are usually carried out on parallel systems. Graph and hypergraph par...

Journal title:
  • Internet Mathematics

Volume 5

Publication date: 2007